Solving Goal Hybrid Markov Decision Processes Using Numeric Classical Planners

Author

  • Florent Teichteil-Königsbuch
Abstract

We present the domain-independent HRFF algorithm, which solves goal-oriented HMDPs by incrementally aggregating plans generated by the Metric-FF planner into a policy defined over discrete and continuous state variables. HRFF takes into account non-monotonic state variables and complex combinations of many discrete and continuous probability distributions. We introduce new data structures and algorithmic paradigms to deal with continuous state spaces: hybrid hierarchical hash tables, domain determinization based on dynamic domain sampling or on static computation of probability distributions' modes, and optimization settings under Metric-FF based on plan probability and length. We compare with HAO* on the Rover domain and show that HRFF outperforms HAO* by many orders of magnitude in terms of computation time and memory usage. We also experiment with challenging and combinatorial HMDP versions of benchmarks from numeric classical planning, with continuous dead-ends and non-monotonic continuous state variables.
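The abstract gives no pseudocode, so the following is only a toy sketch of the general idea it describes: determinize a stochastic domain by taking the mode of each distribution, ask a deterministic planner for a plan, fold that plan into a policy, then simulate the policy and replan from any reached state the policy does not yet cover. Everything here is an assumption made for illustration: the tiny rover-like domain, the names `state_key`, `plan`, and `build_policy`, the breadth-first search standing in for Metric-FF, and the rounding of the continuous variable standing in for the hybrid hierarchical hash tables. It is not the HRFF implementation.

```python
import random
from collections import deque

# Hypothetical toy domain: state = (position: int, energy: float).
GOAL_POS = 3
CAPACITY = 5.0
MOVE_COST = [(1.0, 0.7), (2.0, 0.3)]  # (energy used by "move", probability)

def state_key(state):
    """Hashable key for a hybrid state; rounding is a crude stand-in for a hybrid hash table."""
    pos, energy = state
    return (pos, round(energy, 1))

def applicable(state):
    _, energy = state
    actions = ["recharge"] if energy < CAPACITY else []
    if energy >= 2.0:  # enough energy for the worst-case move cost
        actions.append("move")
    return actions

def step(state, action, cost):
    pos, energy = state
    if action == "move":
        return (pos + 1, energy - cost)
    return (pos, min(energy + 3.0, CAPACITY))  # energy is non-monotonic

def mode_cost():
    """Static determinization: keep only the most probable outcome."""
    return max(MOVE_COST, key=lambda outcome: outcome[1])[0]

def plan(start):
    """Breadth-first search on the mode-determinized domain (stand-in for a classical planner)."""
    frontier = deque([(start, [])])
    seen = {state_key(start)}
    while frontier:
        state, path = frontier.popleft()
        if state[0] >= GOAL_POS:
            return path  # list of (state, action) pairs
        for action in applicable(state):
            nxt = step(state, action, mode_cost())
            if state_key(nxt) not in seen:
                seen.add(state_key(nxt))
                frontier.append((nxt, path + [(state, action)]))
    return None

def build_policy(start, rollouts=200, seed=0):
    """Aggregate deterministic plans into a policy, replanning from uncovered states."""
    rng = random.Random(seed)
    policy = {}

    def aggregate(state):
        for s, a in plan(state):
            policy.setdefault(state_key(s), a)

    aggregate(start)
    costs = [c for c, _ in MOVE_COST]
    probs = [p for _, p in MOVE_COST]
    for _ in range(rollouts):
        state = start
        while state[0] < GOAL_POS:
            if state_key(state) not in policy:
                aggregate(state)  # policy gap: plan again from here
            cost = rng.choices(costs, probs)[0]  # sample the true stochastic outcome
            state = step(state, policy[state_key(state)], cost)
    return policy

policy = build_policy((0, 5.0))
```

In this sketch the initial plan covers only the states reached under the most probable outcomes; states reached when a move costs more than expected are added lazily, which is the incremental aggregation step in miniature.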

Similar resources

Approximate Policy Iteration with a Policy Language Bias (draft)

We explore approximate policy iteration (API), replacing the usual cost-function learning step with a learning step in policy space. We give policy-language biases that enable solution of very large relational Markov decision processes (MDPs) that no previous technique can solve. In particular, we induce high-quality domain-specific planners for classical planning domains (both deterministic and...

Approximate Policy Iteration with a Policy Language Bias

We explore approximate policy iteration (API), replacing the usual cost-function learning step with a learning step in policy space. We give policy-language biases that enable solution of very large relational Markov decision processes (MDPs) that no previous technique can solve. In particular, we induce high-quality domain-specific planners for classical planning domains (both deterministic and...

A Hierarchical Framework for Composing Nested Web Processes

Many of the previous methods for composing Web processes utilize either classical planning techniques such as hierarchical task networks (HTNs), or decision-theoretic planners such as Markov decision processes (MDPs). While offering a way to automatically compose a desired Web process, these techniques do not scale to large processes. In addition, classical planners assume away the uncertaintie...

Multi-Threaded BLAO* Algorithm

We present a heuristic search algorithm for solving goal-based Markov decision processes (MDPs) named Multi-threaded BLAO* (MBLAO*). Hansen and Zilberstein proposed a heuristic search MDP solver named LAO* (Hansen & Zilberstein 2001). Bhuma and Goldsmith extended LAO* to the bidirectional case (Bhuma & Goldsmith 2003) and named their solver BLAO*. Recent experiments on BLAO* (Dai & Goldsmith 20...

Contingent Planning Under Uncertainty via Stochastic Satisfiability

We describe a new planning technique that efficiently solves probabilistic propositional contingent planning problems by converting them into instances of stochastic satisfiability (SSat) and solving these problems instead. We make fundamental contributions in two areas: the solution of SSat problems and the solution of stochastic planning problems. This is the first work extending the planning...

Publication date: 2012